# Lightweight Pretraining
**Arsh Llm** · arshiaafshani · MIT · Large Language Model · Downloads: 162 · Likes: 3
Arsh LLM is an open-source large language model designed for research. It was pretrained on the OLMo mixed dataset on a single T4 GPU, with a total training time of approximately 4-5 days.
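A minimal usage sketch with the Hugging Face Transformers library, assuming the checkpoint is a standard causal language model; the repository ID `arshiaafshani/Arsh-llm` is an assumption and should be checked against the actual model card.

```python
# Hypothetical sketch: load Arsh LLM with Transformers and generate a short continuation.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "arshiaafshani/Arsh-llm"  # assumed repository ID; verify on the hub

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(repo_id)

inputs = tokenizer("Large language models are", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```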
**Tinymistral 248M** · Locutusque · Apache-2.0 · Large Language Model · Transformers, English · Downloads: 1,127 · Likes: 46
A language model scaled down from Mistral 7B to 248 million parameters, designed for text generation and suitable for fine-tuning on downstream tasks.
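A short text-generation sketch using the Transformers pipeline API; the repository ID `Locutusque/TinyMistral-248M` is inferred from the author and model name above and should be verified before use.

```python
# Sketch: text generation with TinyMistral-248M via the text-generation pipeline.
from transformers import pipeline

# Repo ID inferred from the listing; verify it on the hub before relying on it.
generator = pipeline("text-generation", model="Locutusque/TinyMistral-248M")
result = generator("Once upon a time,", max_new_tokens=40, do_sample=True)
print(result[0]["generated_text"])
```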
**Sew Tiny 100k** · asapp · Apache-2.0 · Speech Recognition · Transformers, multilingual · Downloads: 1,080 · Likes: 3
SEW-tiny is a compressed, efficient speech pretraining model from ASAPP Research, pretrained on 16 kHz sampled speech audio and suitable for a variety of downstream speech tasks.
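Because this is a pretrained speech encoder rather than a finished ASR system, a plausible use is extracting hidden-state features from 16 kHz audio for downstream tasks. The sketch below assumes the repository ID `asapp/sew-tiny-100k` and the standard SEW integration in Transformers.

```python
# Sketch: extract hidden-state features from 16 kHz audio with SEW-tiny.
# The base checkpoint has no ASR head, so it is used here as a feature extractor.
import numpy as np
import torch
from transformers import AutoFeatureExtractor, SEWModel

repo_id = "asapp/sew-tiny-100k"  # assumed repository ID

feature_extractor = AutoFeatureExtractor.from_pretrained(repo_id)
model = SEWModel.from_pretrained(repo_id)

waveform = np.zeros(16000, dtype=np.float32)  # 1 second of dummy 16 kHz audio
inputs = feature_extractor(waveform, sampling_rate=16000, return_tensors="pt")

with torch.no_grad():
    hidden_states = model(**inputs).last_hidden_state
print(hidden_states.shape)  # (batch, frames, hidden_size)
```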
**Bert L12 H256 A4** · eli4s · Large Language Model · Transformers · Downloads: 17 · Likes: 0
A lightweight BERT model pretrained using knowledge distillation, with a hidden dimension of 256 and 4 attention heads, suitable for masked language modeling tasks.
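A hedged sketch of masked-token prediction with this distilled BERT; the repository ID `eli4s/Bert-L12-h256-A4` is guessed from the author and model name above and may differ on the hub, and the checkpoint is assumed to ship with a masked-LM head.

```python
# Sketch: masked-token prediction with the distilled BERT variant above.
# The repo ID is an assumption pieced together from the listing.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="eli4s/Bert-L12-h256-A4")
for prediction in fill_mask("The capital of France is [MASK]."):
    print(prediction["token_str"], round(prediction["score"], 3))
```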
**Mengzi Oscar Base Caption** · Langboat · Apache-2.0 · Image-to-Text · Transformers, Chinese · Downloads: 23 · Likes: 2
A Chinese multimodal image captioning model, based on the Mengzi-Oscar pretrained model and fine-tuned on the AIC-ICC Chinese image caption dataset.
**Bert Base Arabic Camelbert Msa Sixteenth** · CAMeL-Lab · Apache-2.0 · Large Language Model · Arabic · Downloads: 215 · Likes: 4
A pretrained model for Arabic NLP tasks, trained on a reduced-scale (1/16) Modern Standard Arabic (MSA) dataset.
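A small sketch of Arabic masked language modeling with the fill-mask pipeline; the repository ID `CAMeL-Lab/bert-base-arabic-camelbert-msa-sixteenth` follows the naming above and is assumed to expose a masked-LM head.

```python
# Sketch: Arabic masked language modeling with the CAMeLBERT MSA (1/16) checkpoint.
from transformers import pipeline

unmasker = pipeline("fill-mask", model="CAMeL-Lab/bert-base-arabic-camelbert-msa-sixteenth")
# The prompt means "The goal of life is [MASK]."
for prediction in unmasker("الهدف من الحياة هو [MASK] ."):
    print(prediction["token_str"], round(prediction["score"], 3))
```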